AITopics | context-aware model

Collaborating Authors

context-aware model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Context Matters: Leveraging Contextual Features for Time Series Forecasting

Chattopadhyay, Sameep, Paliwal, Pulkit, Narasimhan, Sai Shankar, Agarwal, Shubhankar, Chinchali, Sandeep P.

arXiv.org Artificial IntelligenceJan-13-2025

Time series forecasts are often influenced by exogenous contextual features in addition to their corresponding history. For example, in financial settings, it is hard to accurately predict a stock price without considering public sentiments and policy decisions in the form of news articles, tweets, etc. Though this is common knowledge, the current state-of-the-art (SOTA) forecasting models fail to incorporate such contextual information, owing to its heterogeneity and multimodal nature. To address this, we introduce ContextFormer, a novel plug-and-play method to surgically integrate multimodal contextual information into existing pre-trained forecasting models. ContextFormer effectively distills forecast-specific information from rich multimodal contexts, including categorical, continuous, time-varying, and even textual information, to significantly enhance the performance of existing base forecasters. ContextFormer outperforms SOTA forecasting models by up to 30% on a range of real-world datasets spanning energy, traffic, environmental, and financial domains.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2410.12672

Country:

North America > United States (0.67)
Asia (0.46)

Genre: Research Report > New Finding (0.46)

Industry:

Energy (1.00)
Banking & Finance > Trading (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

CODEOFCONDUCT at Multilingual Counterspeech Generation: A Context-Aware Model for Robust Counterspeech Generation in Low-Resource Languages

Bennie, Michael, Xiao, Bushi, Liu, Chryseis Xinyi, Zhang, Demi, Meng, Jian, Tripp, Alayo

arXiv.org Artificial IntelligenceJan-4-2025

This paper introduces a context-aware model for robust counterspeech generation, which achieved significant success in the MCG-COLING-2025 shared task. Our approach particularly excelled in low-resource language settings. By leveraging a simulated annealing algorithm fine-tuned on multilingual datasets, the model generates factually accurate responses to hate speech. We demonstrate state-of-the-art performance across four languages (Basque, English, Italian, and Spanish), with our system ranking first for Basque, second for Italian, and third for both English and Spanish. Notably, our model swept all three top positions for Basque, highlighting its effectiveness in low-resource scenarios. Evaluation of the shared task employs both traditional metrics (BLEU, ROUGE, BERTScore, Novelty) and JudgeLM based on LLM. We present a detailed analysis of our results, including an empirical evaluation of the model performance and comprehensive score distributions across evaluation metrics. This work contributes to the growing body of research on multilingual counterspeech generation, offering insights into developing robust models that can adapt to diverse linguistic and cultural contexts in the fight against online hate speech.

basque, large language model, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2501.00713

Country:

Europe (0.93)
North America > Mexico (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.52)

Add feedback

Analyzing the Attention Heads for Pronoun Disambiguation in Context-aware Machine Translation Models

Mąka, Paweł, Semerci, Yusuf Can, Scholtes, Jan, Spanakis, Gerasimos

arXiv.org Artificial IntelligenceDec-15-2024

In this paper, we investigate the role of attention heads in Context-aware Machine Translation models for pronoun disambiguation in the English-to-German and English-to-French language directions. We analyze their influence by both observing and modifying the attention scores corresponding to the plausible relations that could impact a pronoun prediction. Our findings reveal that while some heads do attend the relations of interest, not all of them influence the models' ability to disambiguate pronouns. We show that certain heads are underutilized by the models, suggesting that model performance could be improved if only the heads would attend one of the relations more strongly. Furthermore, we fine-tune the most promising heads and observe the increase in pronoun disambiguation accuracy of up to 5 percentage points which demonstrates that the improvements in performance can be solidified into the models' parameters.

artificial intelligence, natural language, relation, (15 more...)

arXiv.org Artificial Intelligence

2412.11187

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Portugal > Lisbon > Lisbon (0.14)
Europe > Italy > Tuscany > Florence (0.04)
(19 more...)

Genre: Research Report > New Finding (0.87)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Navigating Brain Language Representations: A Comparative Analysis of Neural Language Models and Psychologically Plausible Models

Zhang, Yunhao, Wang, Shaonan, Dong, Xinyi, Yu, Jiajun, Zong, Chengqing

arXiv.org Artificial IntelligenceApr-30-2024

Neural language models, particularly large-scale ones, have been consistently proven to be most effective in predicting brain neural activity across a range of studies. However, previous research overlooked the comparison of these models with psychologically plausible ones. Moreover, evaluations were reliant on limited, single-modality, and English cognitive datasets. To address these questions, we conducted an analysis comparing encoding performance of various neural language models and psychologically plausible models. Our study utilized extensive multi-modal cognitive datasets, examining bilingual word and discourse levels. Surprisingly, our findings revealed that psychologically plausible models outperformed neural language models across diverse contexts, encompassing different modalities such as fMRI and eye-tracking, and spanning languages from English to Chinese. Among psychologically plausible models, the one incorporating embodied information emerged as particularly exceptional. This model demonstrated superior performance at both word and discourse levels, exhibiting robust prediction of brain activation across numerous regions in both English and Chinese.

brain activation, representation, zhang, (12 more...)

arXiv.org Artificial Intelligence

2404.19364

Country: Asia > China > Beijing > Beijing (0.05)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.98)

Add feedback

MulCogBench: A Multi-modal Cognitive Benchmark Dataset for Evaluating Chinese and English Computational Language Models

Zhang, Yunhao, Zhang, Xiaohan, Li, Chong, Wang, Shaonan, Zong, Chengqing

arXiv.org Artificial IntelligenceMar-2-2024

Pre-trained computational language models have recently made remarkable progress in harnessing the language abilities which were considered unique to humans. Their success has raised interest in whether these models represent and process language like humans. To answer this question, this paper proposes MulCogBench, a multi-modal cognitive benchmark dataset collected from native Chinese and English participants. It encompasses a variety of cognitive data, including subjective semantic ratings, eye-tracking, functional magnetic resonance imaging (fMRI), and magnetoencephalography (MEG). To assess the relationship between language models and cognitive data, we conducted a similarity-encoding analysis which decodes cognitive data based on its pattern similarity with textual embeddings. Results show that language models share significant similarities with human cognitive data and the similarity patterns are modulated by the data modality and stimuli complexity. Specifically, context-aware models outperform context-independent models as language stimulus complexity increases. The shallow layers of context-aware models are better aligned with the high-temporal-resolution MEG signals whereas the deeper layers show more similarity with the high-spatial-resolution fMRI. These results indicate that language models have a delicate relationship with brain language representations. Moreover, the results between Chinese and English are highly consistent, suggesting the generalizability of these findings across languages.

cognitive data, fmri, language model, (17 more...)

arXiv.org Artificial Intelligence

2403.01116

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (0.86)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)

Add feedback

Context-aware Neural Machine Translation for English-Japanese Business Scene Dialogues

Honda, Sumire, Fernandes, Patrick, Zerva, Chrysoula

arXiv.org Artificial IntelligenceNov-20-2023

Despite the remarkable advancements in machine translation, the current sentence-level paradigm faces challenges when dealing with highly-contextual languages like Japanese. In this paper, we explore how context-awareness can improve the performance of the current Neural Machine Translation (NMT) models for English-Japanese business dialogues translation, and what kind of context provides meaningful information to improve translation. As business dialogue involves complex discourse phenomena but offers scarce training resources, we adapted a pretrained mBART model, finetuning on multi-sentence dialogue data, which allows us to experiment with different contexts. We investigate the impact of larger context sizes and propose novel context tokens encoding extra-sentential information, such as speaker turn and scene type. We make use of Conditional Cross-Mutual Information (CXMI) to explore how much of the context the model uses and generalise CXMI to study the impact of the extra-sentential context. Overall, we find that models leverage both preceding sentences and extra-sentential context (with CXMI increasing with context size) and we provide a more focused analysis on honorifics translation. Regarding translation quality, increased source-side context paired with scene and speaker information improves the model performance compared to previous work and our context-agnostic baselines, measured in BLEU and COMET metrics.

computational linguistic, information, translation, (12 more...)

arXiv.org Artificial Intelligence

2311.11976

Country:

Europe > Germany > Brandenburg > Potsdam (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > China > Hong Kong (0.04)
(7 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)

Add feedback

Learning to Reuse Distractors to support Multiple Choice Question Generation in Education

Bitew, Semere Kiros, Hadifar, Amir, Sterckx, Lucas, Deleu, Johannes, Develder, Chris, Demeester, Thomas

arXiv.org Artificial IntelligenceDec-13-2022

Multiple choice questions (MCQs) are widely used in digital learning systems, as they allow for automating the assessment process. However, due to the increased digital literacy of students and the advent of social media platforms, MCQ tests are widely shared online, and teachers are continuously challenged to create new questions, which is an expensive and time-consuming task. A particularly sensitive aspect of MCQ creation is to devise relevant distractors, i.e., wrong answers that are not easily identifiable as being wrong. This paper studies how a large existing set of manually created answers and distractors for questions over a variety of domains, subjects, and languages can be leveraged to help teachers in creating new MCQs, by the smart reuse of existing distractors. We built several data-driven models based on context-aware question and distractor representations, and compared them with static feature-based models. The proposed models are evaluated with automated metrics and in a realistic user test with teachers. Both automatic and human evaluations indicate that context-aware models consistently outperform a static feature-based approach. For our best-performing context-aware model, on average 3 distractors out of the 10 shown to teachers were rated as high-quality distractors. We create a performance benchmark, and make it public, to enable comparison between different approaches and to introduce a more standardized evaluation of the task. The benchmark contains a test of 298 educational questions covering multiple subjects & languages and a 77k multilingual pool of distractor vocabulary for future research.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TLT.2022.3226523

2210.13964

Country:

Europe > Sweden > Östergötland County > Linköping (0.04)
Africa > Ethiopia (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
(3 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
(3 more...)

Add feedback

RGRecSys: A Toolkit for Robustness Evaluation of Recommender Systems

Ovaisi, Zohreh, Heinecke, Shelby, Li, Jia, Zhang, Yongfeng, Zheleva, Elena, Xiong, Caiming

arXiv.org Artificial IntelligenceJan-12-2022

Robust machine learning is an increasingly important topic that focuses on developing models resilient to various forms of imperfect data. Due to the pervasiveness of recommender systems in online technologies, researchers have carried out several robustness studies focusing on data sparsity and profile injection attacks. Instead, we propose a more holistic view of robustness for recommender systems that encompasses multiple dimensions - robustness with respect to sub-populations, transformations, distributional disparity, attack, and data sparsity. While there are several libraries that allow users to compare different recommender system models, there is no software library for comprehensive robustness evaluation of recommender system models under different scenarios. As our main contribution, we present a robustness evaluation toolkit, Robustness Gym for RecSys (RGRecSys -- https://www.github.com/salesforce/RGRecSys), that allows us to quickly and uniformly evaluate the robustness of recommender system models.

library, proceedings, recommender system, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3488560.3502192

2201.04399

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
Asia > Middle East > Lebanon (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Security & Privacy (1.00)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)

Add feedback

MCUa: Multi-level Context and Uncertainty aware Dynamic Deep Ensemble for Breast Cancer Histology Image Classification

Senousy, Zakaria, Abdelsamea, Mohammed M., Gaber, Mohamed Medhat, Abdar, Moloud, Acharya, U Rajendra, Khosravi, Abbas, Nahavandi, Saeid

arXiv.org Artificial IntelligenceAug-24-2021

Breast histology image classification is a crucial step in the early diagnosis of breast cancer. In breast pathological diagnosis, Convolutional Neural Networks (CNNs) have demonstrated great success using digitized histology slides. However, tissue classification is still challenging due to the high visual variability of the large-sized digitized samples and the lack of contextual information. In this paper, we propose a novel CNN, called Multi-level Context and Uncertainty aware (MCUa) dynamic deep learning ensemble model.MCUamodel consists of several multi-level context-aware models to learn the spatial dependency between image patches in a layer-wise fashion. It exploits the high sensitivity to the multi-level contextual information using an uncertainty quantification component to accomplish a novel dynamic ensemble model.MCUamodelhas achieved a high accuracy of 98.11% on a breast cancer histology image dataset. Experimental results show the superior effectiveness of the proposed solution compared to the state-of-the-art histology classification models.

classification, context-aware model, information, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TBME.2021.3107446

2108.10709

Country:

Asia > Singapore (0.14)
Africa > Middle East > Egypt (0.04)
Oceania > Australia (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (1.00)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision > Image Understanding (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Resonance: Replacing Software Constants with Context-Aware Models in Real-time Communication

Gupchup, Jayant, Aazami, Ashkan, Fan, Yaran, Filipi, Senja, Finley, Tom, Inglis, Scott, Asteborg, Marcus, Caroll, Luke, Chari, Rajan, Cozowicz, Markus, Gopal, Vishak, Prakash, Vinod, Bendapudi, Sasikanth, Gerrits, Jack, Lau, Eric, Liu, Huazhou, Rossi, Marco, Slobodianyk, Dima, Birjukov, Dmitri, Cooper, Matty, Javar, Nilesh, Perednya, Dmitriy, Srinivasan, Sriram, Langford, John, Cutler, Ross, Gehrke, Johannes

arXiv.org Artificial IntelligenceNov-22-2020

Large software systems tune hundreds of 'constants' to optimize their runtime performance. These values are commonly derived through intuition, lab tests, or A/B tests. A 'one-size-fits-all' approach is often sub-optimal as the best value depends on runtime context. In this paper, we provide an experimental approach to replace constants with learned contextual functions for Skype - a widely used real-time communication (RTC) application. We present Resonance, a system based on contextual bandits (CB). We describe experiences from three real-world experiments: applying it to the audio, video, and transport components in Skype. We surface a unique and practical challenge of performing machine learning (ML) inference in large software systems written using encapsulation principles. Finally, we open-source FeatureBroker, a library to reduce the friction in adopting ML models in such development environments

experiment, featurebroker, library, (15 more...)

arXiv.org Artificial Intelligence

2011.12715

Country: North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications (0.96)

Add feedback